Multisensory Vocal Communication and Its Neural Bases in Nonhuman Primates
Abstract
Face-to-face communication is a temporally extended, multisensory process. In human speech, the auditory component is, for the most part, the result of vocal fold movements that release sound energy. This sound energy travels up through the oral and nasal cavities, where it is radiated out of the lips and nostrils. Changes in the internal and external shape of these cavities lead to different speech sounds as well as deformations of the external face around the oral aperture and other regions (Chandrasekaran, Trubanova, Stillittano, Caplier, & Ghazanfar, 2009; Yehia, Rubin, & Vatikiotis-Bateson, 1998). This orofacial motion acts as a cue to what is being heard and is referred to as visual speech. Visual speech is not just an epiphenomenon of the vocal production process but is a critical component of face-to-face communication. Visual speech provides considerable intelligibility benefits to the perception of auditory speech (Ross, Saint-Amour, Leavitt, Javitt, & Foxe, 2007; Sumby & Pollack, 1954), is difficult to ignore (integrating readily and automatically with auditory speech; McGurk & MacDonald, 1976), and interacts with auditory speech even at the earliest stages of human cognitive development (Lewkowicz & Ghazanfar, 2009). Thus, multisensory speech seems to be the primary mode of speech perception and is not a capacity that is simply piggybacked onto auditory speech perception (Rosenblum, 2005). If the processing of multisensory signals forms the default mode of speech perception, then this should be reflected both in the evolution of vocal communication and in the organization of neural processes related to communication. Naturally, any vertebrate organism (from fishes and frogs to birds and dogs) that can produce a vocalization will have a concomitant visual motion in the area of the mouth.
In the primate lineage, the number of different vocalizations, and thus different patterns of facial motion, has increased during the course of evolution relative to other species. In light of this, it is not surprising that when nonhuman primates produce vocalizations, these utterances are accompanied by a variety of visual cues encompassing a range of mouth, ear, and head movements (Hauser, Evans, & Marler, 1993; Partan, 2002), as is the case for human speech production. Similarities between human and nonhuman primate vocal production imply that the perceptual and neural mechanisms underlying multisensory vocal perception, and their evolution, could be illuminated by studying monkeys and other primates. The purpose of this review is (1) to briefly describe the data revealing that human speech is not uniquely multisensory and that the default mode of communication is in fact multisensory in nonhuman primates (hereafter, primates) as well, and (2) to suggest that this mode of communication is reflected in the organization of the neocortex. We summarize the underlying structure of communication signals in primates, highlighting the similarities to and differences from human speech. We then show that nonhuman primates display strategies and behavioral patterns very similar to those of humans in their perception of vocal communication signals. Finally, we review what we know about the neurophysiological bases of multisensory vocal communication. To conclude, we explore a mechanistic account of the neurophysiological integration of visual and auditory communication channels and of how such integration could guide behavior.
Related Resources
The Default Mode of Primate Vocal Communication and Its Neural Correlates
It has been argued that the integration of the visual and auditory channels during human speech perception is the default mode of speech processing (Rosenblum, 2005). That is, speech perception is not a capacity that is ‘piggybacked’ onto auditory-only speech perception. Visual information from the mouth and other parts of the face is used by all perceivers and readily integrates with auditory s...
Multisensory integration of dynamic faces and voices in rhesus monkey auditory cortex.
In the social world, multiple sensory channels are used concurrently to facilitate communication. Among human and nonhuman primates, faces and voices are the primary means of transmitting social signals (Adolphs, 2003; Ghazanfar and Santos, 2004). Primates recognize the correspondence between species-specific facial and vocal expressions (Massaro, 1998; Ghazanfar and Logothetis, 2003; Izumi and...
Precursors to language: Social cognition and pragmatic inference in primates.
Despite their differences, human language and the vocal communication of nonhuman primates share many features. Both constitute forms of coordinated activity, rely on many shared neural mechanisms, and involve discrete, combinatorial cognition that includes rich pragmatic inference. These common features suggest that during evolution the ancestors of all modern primates faced similar social pro...
The evolution of language from social cognition.
Despite their differences, human language and the vocal communication of nonhuman primates share many features. Both constitute a form of joint action, rely on similar neural mechanisms, and involve discrete, combinatorial cognition. These shared features suggest that during evolution the ancestors of modern primates faced similar social problems and responded by evolving similar systems of per...
Primate vocal production and the riddle of language evolution
Trying to uncover the roots of human speech and language has been the premier motivation to study the signalling behaviour of nonhuman primates for several decades. Focussing on the question of whether we find evidence for linguistic reference in the production of nonhuman primate vocalizations, I will first discuss how the criteria used to diagnose referential signalling have changed over time...